Abstract: A class of doubly-generalized low-density parity-check
(D-GLDPC) codes, where single parity-check (SPC) codes
are used as variable nodes (VNs), is investigated. An expression 
for the growth rate of the weight distribution of any D-GLDPC 
ensemble with a uniform check node (CN) set is presented at first, 
together with an analytical technique for its efficient evaluation. 
These tools are then used for detailed analysis of a case study, 
namely, a rate-1/2 D-GLDPC ensemble where all the CNs are
(7, 4) Hamming codes and all the VNs are length-7 SPC codes. 
It is illustrated how the VN representations can heavily affect 
the code properties and how different VN representations can 
be combined within the same graph to enhance some of the 
code parameters. The analysis is conducted over the binary 
erasure channel. Interesting features of the new codes include the 
capability of achieving a good compromise between waterfall and 
error floor performance while preserving graphical regularity, 
and values of threshold outperforming LDPC counterparts. 



I. Introduction 

Recently, low-density parity-check (LDPC) codes [1] have 
been intensively studied due to their near-Shannon-limit per- 
formance under iterative belief-propagation decoding. It is 
usual to represent an LDPC code as a bipartite graph (known 
as a Tanner graph [2]), where the nodes are grouped into two 
disjoint sets, namely, the variable nodes (VNs) and the check 
nodes (CNs), such that each edge may only connect a VN with 
a CN. Here, a degree-q VN can be interpreted as a length-q
repetition code, as it repeats q times its single information
bit toward the CNs. Similarly, a degree-s CN of an LDPC 
code can be interpreted as a length-s single parity-check (SPC) 
code, as it checks the parity of the s VNs connected to it. 

Doubly-generalized LDPC (D-GLDPC) codes [3] (see also
the previous work [4]) generalize the concept of LDPC codes. 
In a D-GLDPC code, a degree-s CN may in principle be any 
(s, h) linear block code, s being the code length and h the code 
dimension. Such a CN accounts for s — h linearly independent 
parity-check equations. Analogously, a degree-q VN may in
principle be any (q, k) linear block code, q being the code
length and k the code dimension. Such a VN is associated 
with k D-GLDPC code bits. It interprets these bits as its local 
information bits and interfaces to the CN set through its q local 
code bits. A D-GLDPC code is said to be regular (or strongly 
regular) if all of its VNs are of the same type and all of its CNs
are of the same type and is said to be irregular otherwise. We 
point out that the properties of a D-GLDPC code are heavily 
affected by the generator matrix used to represent its VNs, 
i.e., by the association between local input words and local 
codewords of any VN. On the other hand, the overall code 
properties do not depend on the representation of its CNs. 
Therefore, by type of a VN we mean its local input-output 
weight enumerating function (IO-WEF), while by type of a CN
we mean its local weight enumerating function (WEF). Among
irregular D-GLDPC ensembles, we call weakly regular any
ensemble where all the CNs have the same WEF and where
all the VNs have the same WEF but may have a different
IO-WEF (i.e., they are associated with the same code, but are
represented by a different generator matrix). Note that weakly 
regular D-GLDPC codes preserve the graphical regularity as 
all the VNs (resp. CNs) have the same degree. 

An analysis of the stability condition over the binary erasure 
channel (BEC) suggests that single parity-check (SPC) codes 
used as VNs can offer some benefits when codes with local 
minimum distance larger than 2 are employed as CNs [5]. In 
this paper we elaborate on this idea and propose an analysis of 
a class of strongly and weakly regular D-GLDPC codes where 
all the CNs have a local minimum distance larger than 2 and 
all the VNs are SPC codes. As proved in [6], the absence of 
CNs with minimum distance 2 is sufficient to have a growth 
rate of the weight distribution $G(\alpha)$ (see Section III-B) such
that $\alpha^* = \inf\{\alpha > 0 \mid G(\alpha) \geq 0\}$ is strictly positive, which
implies an exponentially small number of codewords of small 
weight linear in the block length. 

The threshold analysis over the BEC for any irregular D-GLDPC
ensemble is reviewed in Section III-A. Two new
results, namely, an expression for the growth rate of the weight
distribution of any D-GLDPC ensemble with a uniform CN set
(i.e., all the CNs are of the same type), and an efficient means
of its evaluation based on a polynomial system, are presented
in Section III-B and Section III-C. In Section IV, asymptotic
and finite length analyses of a case study are presented. More
specifically, strongly and weakly regular rate-1/2 D-GLDPC 
codes, where all the CNs are (7, 4) Hamming codes and all 
the VNs are length-7 SPC codes, are investigated. The (3,6) 
regular LDPC ensemble is used as a benchmark for the new 
class of codes, as it offers the best threshold over the BEC 
among rate-1/2 LDPC codes with a regular Tanner graph [7]. 

II. Preliminaries and Notation 

We define a D-GLDPC code ensemble $\mathcal{M}_n$ as follows,
where $n$ denotes the number of VNs. There are $n_c$ different
CN types $t \in I_c = \{1, 2, \ldots, n_c\}$, and $n_v$ different VN
types $t \in I_v = \{1, 2, \ldots, n_v\}$. For each CN type $t \in I_c$,
we denote by $h_t$, $s_t$ and $r_t$ the CN dimension, length and
minimum distance, respectively. For each VN type $t \in I_v$,
we denote by $k_t$, $q_t$ and $p_t$ the VN dimension, length and
minimum distance, respectively. For $t \in I_c$, $\rho_t$ denotes the
fraction of edges connected to CNs of type $t$. Similarly, for
$t \in I_v$, $\lambda_t$ denotes the fraction of edges connected to VNs of
type $t$. Note that all of these variables are independent of $n$.
The polynomials $\rho(x)$ and $\lambda(x)$ are defined by
$\rho(x) = \sum_{t \in I_c} \rho_t x^{s_t - 1}$ and
$\lambda(x) = \sum_{t \in I_v} \lambda_t x^{q_t - 1}$. If $E$ denotes the
number of edges in the Tanner graph, the number of CNs of
type $t \in I_c$ is then given by $E \rho_t / s_t$, and the number of VNs
of type $t \in I_v$ is then given by $E \lambda_t / q_t$. Denoting as usual
$\int_0^1 \rho(x)\,dx$ and $\int_0^1 \lambda(x)\,dx$ by $\int\rho$ and $\int\lambda$ respectively, we
see that the number of edges in the Tanner graph is given by
$E = n / \int\lambda$ and the number of CNs is given by $m = E \int\rho$.
Therefore, the fraction of CNs of type $t \in I_c$ and the fraction
of VNs of type $t \in I_v$ are given by

$$ \gamma_t = \frac{\rho_t}{s_t \int\rho} \quad \text{and} \quad \delta_t = \frac{\lambda_t}{q_t \int\lambda} \tag{1} $$

respectively. Also the length of any D-GLDPC codeword in
the ensemble is given by

$$ N = \frac{n}{\int\lambda} \sum_{t \in I_v} \frac{\lambda_t k_t}{q_t}. \tag{2} $$

Note that this is a linear function of $n$. Similarly, the total
number of parity-check equations for any D-GLDPC code
in the ensemble is given by
$M = \frac{m}{\int\rho} \sum_{t \in I_c} \frac{\rho_t (s_t - h_t)}{s_t}$. A
member of the ensemble then corresponds to a permutation
on the $E$ edges connecting CNs to VNs.

The WEF for CN type $t \in I_c$ is given by
$A^{(t)}(z) = 1 + \sum_{u = r_t}^{s_t} A_u^{(t)} z^u$.
Here $A_u^{(t)} \geq 0$ denotes the number of weight-$u$
codewords for CNs of type $t$. The IO-WEF for VN type
$t \in I_v$ is given by
$B^{(t)}(x, y) = 1 + \sum_{u = 1}^{k_t} \sum_{v = p_t}^{q_t} B_{u,v}^{(t)} x^u y^v$.
Here $B_{u,v}^{(t)} \geq 0$ denotes the number of weight-$v$ codewords
generated by input words of weight $u$ for VNs of type $t$. Also,
for each $t \in I_v$, corresponding to the polynomial $B^{(t)}(x, y)$
we denote the set $S_t = \{(i, j) \in \mathbb{Z}^2 : B_{i,j}^{(t)} > 0\}$.

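The design rate of any D-GLDPC ensemble is given by

$$ R = 1 - \frac{\sum_{t \in I_c} \rho_t (1 - R_t)}{\sum_{t \in I_v} \lambda_t R_t} \tag{3} $$

where for $t \in I_c$ (resp. $t \in I_v$) $R_t$ is the local code rate of a
type-$t$ CN (resp. VN).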
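
The bookkeeping in (1)-(3) can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function name `ensemble_parameters` and the tuple layout of its arguments are illustrative assumptions. It takes the edge-perspective fractions and the local code parameters and returns $E$, $m$, $\gamma_t$, $\delta_t$, $N$, $M$ and $R$, and is evaluated on the case study of this paper ((7, 4) Hamming CNs, length-7 SPC VNs, $n = 500$).

```python
from fractions import Fraction


def ensemble_parameters(n, cn_types, vn_types):
    """Node-perspective quantities of a D-GLDPC ensemble.

    cn_types: list of (rho_t, s_t, h_t) = (edge fraction, length, dimension)
    vn_types: list of (lambda_t, q_t, k_t) = (edge fraction, length, dimension)
    The argument layout is an illustrative assumption, not taken from the paper.
    """
    # Integrals of rho(x) and lambda(x) over [0, 1]: each x^(d-1) term gives 1/d.
    int_rho = sum(Fraction(r) / s for r, s, _ in cn_types)
    int_lam = sum(Fraction(l) / q for l, q, _ in vn_types)
    E = n / int_lam                      # number of edges
    m = E * int_rho                      # number of CNs
    gamma = [Fraction(r) / (s * int_rho) for r, s, _ in cn_types]   # eq. (1)
    delta = [Fraction(l) / (q * int_lam) for l, q, _ in vn_types]   # eq. (1)
    N = E * sum(Fraction(l) * k / q for l, q, k in vn_types)        # eq. (2)
    M = E * sum(Fraction(r) * (s - h) / s for r, s, h in cn_types)  # parity checks
    R = 1 - M / N                        # equals eq. (3), since E cancels
    return {"E": E, "m": m, "gamma": gamma, "delta": delta, "N": N, "M": M, "R": R}


# Case study: all CNs are (7, 4) Hamming codes, all VNs are (7, 6) SPC codes.
params = ensemble_parameters(500, [(1, 7, 4)], [(1, 7, 6)])
print(params["E"], params["m"], params["N"], params["M"], params["R"])
```

Exact rational arithmetic is used so that the integer-valued quantities ($E$, $m$, $N$, $M$) come out without rounding noise.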

Throughout this paper, the notation $e = \exp(1)$ denotes
Napier's number, all the logarithms are assumed to have base
$e$ and for $0 < x < 1$ the notation $h(x) = -x \log(x) - (1 - x) \log(1 - x)$
denotes the binary entropy function.

III. Asymptotics

A. Asymptotic Threshold over the BEC

An extrinsic information transfer (EXIT) chart [8] approach
can be used to calculate the threshold over the BEC (denoted
by $\epsilon^*$) of any irregular D-GLDPC ensemble. Let $\epsilon$ be the BEC
erasure probability and $I_A$ be the average a priori information.
The EXIT function of a type-$t$ $(q_t, k_t)$ VN is given by

$$ I_{E,V}^{(t)}(I_A, \epsilon) = 1 - \frac{1}{q_t} \sum_{j=0}^{q_t - 1} \sum_{z=0}^{k_t} (1 - I_A)^j \, I_A^{q_t - j - 1} \, \epsilon^z (1 - \epsilon)^{k_t - z} \, a_{j,z}^{(t)} $$

where $a_{j,z}^{(t)} = (q_t - j)\, \tilde{e}^{(t)}_{q_t - j,\, k_t - z} - (j + 1)\, \tilde{e}^{(t)}_{q_t - j - 1,\, k_t - z}$
and $\tilde{e}^{(t)}_{g,h}$ is the $(g, h)$-th un-normalized split information function
[8] for a type-$t$ VN. It is defined as the summation of the ranks
over all the submatrices obtained by selecting $g$ columns from
the generator matrix $G_t$ of a type-$t$ VN and $h$ columns from
the identity matrix $I_{k_t}$ (of order $k_t$).

The EXIT function of a type-$t$ $(s_t, h_t)$ CN is given by

$$ I_{E,C}^{(t)}(I_A) = 1 - \frac{1}{s_t} \sum_{j=0}^{s_t - 1} (1 - I_A)^j \, I_A^{s_t - j - 1} \, a_j^{(t)} $$

where $a_j^{(t)} = (s_t - j)\, \tilde{e}^{(t)}_{s_t - j} - (j + 1)\, \tilde{e}^{(t)}_{s_t - j - 1}$ and $\tilde{e}^{(t)}_g$ is the
$g$-th un-normalized information function for a type-$t$ CN. It is
defined as the summation of the ranks over all the submatrices
obtained by selecting $g$ columns from $G_t$.

The EXIT function of the whole VN set is given by
$I_{E,V}(I_A, \epsilon) = \sum_{t \in I_v} \lambda_t I_{E,V}^{(t)}(I_A, \epsilon)$, while the EXIT function
of the whole CN set is given by
$I_{E,C}(I_A) = \sum_{t \in I_c} \rho_t I_{E,C}^{(t)}(I_A)$. We highlight that the threshold depends on
the VN representations through the split information functions
$\tilde{e}_{g,h}$ of the VNs. On the other hand, the threshold does not
depend on the representation of the CNs [5].
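
To make the role of the (split) information functions concrete, the sketch below computes them by brute force over GF(2) and plugs them into the EXIT expressions $1 - \frac{1}{q} \sum_j \sum_z (1 - I_A)^j I_A^{q-j-1} \epsilon^z (1-\epsilon)^{k-z} a_{j,z}$ (VN) and $1 - \frac{1}{s} \sum_j (1 - I_A)^j I_A^{s-j-1} a_j$ (CN). It is a hedged illustration rather than the procedure used for the results of this paper: the function names, the encoding of generator-matrix columns as integer bitmasks, and the bisection-based threshold search are all assumptions. As a sanity check it is run on the (3, 6) regular LDPC ensemble, where the VN is a length-3 repetition code and the CN a length-6 SPC code.

```python
from itertools import combinations


def gf2_rank(columns):
    """Rank over GF(2) of columns given as integer bitmasks."""
    basis = []
    for v in columns:
        for b in basis:
            v = min(v, v ^ b)  # clears the leading bit of b in v when it is set
        if v:
            basis.append(v)
    return len(basis)


def information(G_cols):
    """e[g]: sum of ranks over all choices of g columns of G."""
    return [sum(gf2_rank(c) for c in combinations(G_cols, g))
            for g in range(len(G_cols) + 1)]


def split_information(G_cols, k):
    """e[g][h]: sum of ranks over g columns of G joined with h columns of I_k."""
    I_cols = [1 << i for i in range(k)]
    return [[sum(gf2_rank(cg + ch)
                 for cg in combinations(G_cols, g)
                 for ch in combinations(I_cols, h))
             for h in range(k + 1)]
            for g in range(len(G_cols) + 1)]


def make_vn_exit(G_cols, k):
    """EXIT function (I_A, eps) -> I_E of a (q, k) VN over the BEC."""
    q, e = len(G_cols), split_information(G_cols, k)

    def exit_fn(I_A, eps):
        total = 0.0
        for j in range(q):
            for z in range(k + 1):
                a = (q - j) * e[q - j][k - z] - (j + 1) * e[q - j - 1][k - z]
                total += ((1 - I_A) ** j * I_A ** (q - j - 1)
                          * eps ** z * (1 - eps) ** (k - z) * a)
        return 1 - total / q
    return exit_fn


def make_cn_exit(G_cols):
    """EXIT function I_A -> I_E of a CN with the given generator columns."""
    s, e = len(G_cols), information(G_cols)

    def exit_fn(I_A):
        total = sum((1 - I_A) ** j * I_A ** (s - j - 1)
                    * ((s - j) * e[s - j] - (j + 1) * e[s - j - 1])
                    for j in range(s))
        return 1 - total / s
    return exit_fn


def threshold(vn_exit, cn_exit, iters=5000, tol=1e-9, steps=20):
    """Largest eps (by bisection) for which the EXIT recursion reaches I_E ~ 1."""
    lo, hi = 0.0, 1.0
    for _ in range(steps):
        eps, I_c = (lo + hi) / 2, 0.0
        for _ in range(iters):
            I_c = cn_exit(vn_exit(I_c, eps))
            if I_c > 1 - tol:
                break
        lo, hi = (eps, hi) if I_c > 1 - tol else (lo, eps)
    return lo


rep3 = make_vn_exit([1, 1, 1], 1)            # length-3 repetition VN
spc6 = make_cn_exit([1, 2, 4, 8, 16, 31])    # length-6 SPC CN, G = [I_5 | 1]
eps_ldpc = threshold(rep3, spc6)
print(eps_ldpc)
```

For these two component codes the generic expressions collapse to the familiar LDPC forms $1 - \epsilon (1 - I_A)^{q-1}$ and $I_A^{s-1}$, and the search should land close to the value 0.429 quoted later for the (3, 6) ensemble. The same routines accept the column bitmasks of a (7, 4) Hamming generator matrix and of a length-7 SPC generator matrix in any representation.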

B. Growth Rate of the Weight Distribution 

The growth rate of the weight distribution (or spectral
shape) of the irregular D-GLDPC ensemble sequence $\{\mathcal{M}_n\}$
is defined by^1

$$ G(\alpha) = \lim_{n \to \infty} \frac{1}{n} \log \mathbb{E}_{\mathcal{M}_n} \left[ N_{\alpha n} \right] \tag{4} $$

where $\mathbb{E}_{\mathcal{M}_n}$ denotes the expectation operator over the ensemble
$\mathcal{M}_n$, and $N_w$ denotes the number of codewords of weight
$w$ of a randomly chosen D-GLDPC code in the ensemble $\mathcal{M}_n$.
Note that the argument of the growth rate function $G(\alpha)$ is
equal to the ratio of D-GLDPC codeword weight to the number
of VNs; by (2), this captures the behavior of codewords linear
in the block length, as in [9] for the LDPC case. Next, we
formulate techniques for evaluation of the growth rate for a
D-GLDPC ensemble $\mathcal{M}_n$ with a uniform CN set, over a wider
range of $\alpha$ than was considered in [6] (where the case $\alpha \to 0$
was analyzed).

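Proposition 1: Consider a D-GLDPC ensemble with a uniform
CN set. Let $A(z)$ be the WEF of each CN and $B^{(t)}(x, y)$
be the IO-WEF of any type-$t$ VN with $t \in I_v$. The growth
rate of the weight distribution is given by

$$ G(\alpha) = \max_{\boldsymbol{\alpha}, \boldsymbol{\beta}} \left[ \sum_{t \in I_v} X^{(t)}(\alpha_t, \beta_t) + Y(\beta) \right] \tag{5} $$

where $\boldsymbol{\alpha} = (\alpha_t)_{t \in I_v}$, $\boldsymbol{\beta} = (\beta_t)_{t \in I_v}$, $\beta = \sum_{t \in I_v} \beta_t$, and the
maximization is subject to the constraints $\alpha_t \geq 0$,
$m^{(t)}(\alpha_t) \leq \beta_t \leq M^{(t)}(\alpha_t)$ for all $t \in I_v$ and

$$ R(\boldsymbol{\alpha}, \boldsymbol{\beta}) = \sum_{t \in I_v} \alpha_t = \alpha. \tag{6} $$

^1 Note that using (2), we may also define the growth rate with
respect to the number of D-GLDPC code bits as
$H(\gamma) = \lim_{N \to \infty} \frac{1}{N} \log \mathbb{E}_{\mathcal{M}_n} \left[ N_{\gamma N} \right]$.
It is straightforward to show that $H(\gamma) = G(\gamma y) / y$,
where $y = \frac{1}{\int\lambda} \sum_{t \in I_v} \frac{\lambda_t k_t}{q_t}$.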

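The expression of $Y(\beta)$ in (5) is

$$ Y(\beta) = \frac{\int\rho}{\int\lambda} \log A(z_0) - \beta \log z_0 - \frac{h(\beta \int\lambda)}{\int\lambda} $$

where $z_0$ is the unique positive real solution to

$$ \frac{A'(z_0)}{A(z_0)} \cdot z_0 = \frac{\int\lambda}{\int\rho} \, \beta \tag{7} $$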



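while (for each $t \in I_v$) the expression of $X^{(t)}(\alpha_t, \beta_t)$ is

$$ X^{(t)}(\alpha_t, \beta_t) = \log \frac{\left( B^{(t)}(x_{0,t}, y_{0,t}) \right)^{\delta_t}}{x_{0,t}^{\alpha_t} \, y_{0,t}^{\beta_t}} $$

where $\delta_t$ is defined in (1) and $(x_{0,t}, y_{0,t})$ are the unique
positive real solutions to the pair of simultaneous equations^2

$$ \frac{\frac{\partial B^{(t)}}{\partial x}(x_{0,t}, y_{0,t})}{B^{(t)}(x_{0,t}, y_{0,t})} \cdot x_{0,t} = \frac{\alpha_t}{\delta_t} \tag{8} $$

$$ \frac{\frac{\partial B^{(t)}}{\partial y}(x_{0,t}, y_{0,t})}{B^{(t)}(x_{0,t}, y_{0,t})} \cdot y_{0,t} = \frac{\beta_t}{\delta_t}. \tag{9} $$

Finally, letting $\boldsymbol{\omega} = (\omega_1 \; \omega_2 \; \cdots \; \omega_{k_t})$, we define the
function $M^{(t)}(\alpha) = \max_{\boldsymbol{\omega}} \sum_{i=1}^{k_t} V_i^{(t)} \omega_i$, where $V_i^{(t)}$ denotes
the maximum local codeword weight associated with a local
input weight $i \in \{1, 2, \ldots, k_t\}$ for a type-$t$ VN (i.e., the
maximum $j$ with $(i, j) \in S_t$), and the maximization is
subject to the constraints $\omega_i \geq 0$ for all $i = 1, 2, \ldots, k_t$,
$\sum_{i=1}^{k_t} \omega_i \leq 1$ and $\sum_{i=1}^{k_t} i \, \omega_i = \alpha$. Also the function $m^{(t)}$ is
defined as $m^{(t)}(\alpha) = \min_{\boldsymbol{\omega}} \sum_{i=1}^{k_t} U_i^{(t)} \omega_i$, where $U_i^{(t)}$ denotes
the minimum local codeword weight associated with a local
input weight $i \in \{1, 2, \ldots, k_t\}$ for a type-$t$ VN (i.e., the
minimum $j$ with $(i, j) \in S_t$), and the minimization is subject
to the constraints $\omega_i \geq 0$ for all $i = 1, 2, \ldots, k_t$, $\sum_{i=1}^{k_t} \omega_i \leq 1$
and $\sum_{i=1}^{k_t} i \, \omega_i = \alpha$.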

The proof of Proposition 1 is omitted due to space constraints
(it can be found in [10]). Note that (7) provides an
implicit definition of $z_0$ as a function of $\beta$. Similarly, for any
$t \in I_v$, (8) and (9) provide implicit definitions of $x_{0,t}$ and $y_{0,t}$
as functions of $\alpha_t$ and $\beta_t$.

We deduce as a special case the growth rate for a strongly 
regular ensemble. 

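Corollary 1: The growth rate of the weight distribution for
a strongly regular D-GLDPC ensemble is given by

$$ G(\alpha) = \max_{m(\alpha) \leq \beta \leq M(\alpha)} \left\{ \log \frac{B(x_0, y_0)}{x_0^{\alpha} \, y_0^{\beta}} + \frac{\int\rho}{\int\lambda} \log A(z_0) - \beta \log z_0 - \frac{h(\beta \int\lambda)}{\int\lambda} \right\} \tag{10} $$

where $x_0$, $y_0$ and $z_0$ are the unique positive real solutions to
(7) together with

$$ \frac{\frac{\partial B}{\partial x}(x_0, y_0)}{B(x_0, y_0)} \cdot x_0 = \alpha \quad \text{and} \quad \frac{\frac{\partial B}{\partial y}(x_0, y_0)}{B(x_0, y_0)} \cdot y_0 = \beta. \tag{11} $$

^2 The uniqueness of $z_0$, and of $x_{0,t}$ and $y_{0,t}$ for each $t \in I_v$, is guaranteed
by Hayman's formula (see for example [9, Appendix II]).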
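
As a plausibility check of Corollary 1, the sketch below evaluates the growth rate for the (3, 6) regular LDPC ensemble, viewed as a strongly regular D-GLDPC ensemble with $B(x, y) = 1 + x y^3$ (length-3 repetition VN) and $A(z) = \frac{1}{2}[(1 + z)^6 + (1 - z)^6]$ (length-6 SPC CN). This specialization is an assumption made for illustration: here $\beta = 3\alpha$ is forced, so no maximization is needed, and the VN term $\log B(x_0, y_0) - \alpha \log x_0 - \beta \log y_0$ reduces to $h(\alpha)$. Equation (7) is solved by bisection, and the smallest positive zero crossing of $G$ is located by a second bisection; the function names and bracketing intervals are illustrative choices.

```python
from math import exp, log


def h(x):
    """Binary entropy function with natural logarithms, for 0 < x < 1."""
    return -x * log(x) - (1 - x) * log(1 - x)


def solve_z0(A, rhs):
    """Positive root of z A'(z) / A(z) = rhs, by bisection on log z.

    A[u] is the coefficient of z^u in the CN WEF. The left-hand side is
    increasing in z, which is what makes plain bisection applicable.
    """
    def lhs(z):
        den = sum(a * z ** u for u, a in enumerate(A))
        return sum(u * a * z ** u for u, a in enumerate(A)) / den

    lo, hi = -40.0, 40.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if lhs(exp(mid)) < rhs:
            lo = mid
        else:
            hi = mid
    return exp((lo + hi) / 2)


def growth_rate_36(alpha):
    """G(alpha) for the (3, 6) regular LDPC ensemble (assumed special case)."""
    A = [1, 0, 15, 0, 15, 0, 1]          # WEF of the length-6 SPC code
    int_lam, int_rho = 1 / 3, 1 / 6
    beta = 3 * alpha                     # every '1' bit sends three '1' edges
    z0 = solve_z0(A, beta * int_lam / int_rho)
    log_A = log(sum(a * z0 ** u for u, a in enumerate(A)))
    vn_term = h(alpha)                   # closed form of the VN term for 1 + x y^3
    return (vn_term + (int_rho / int_lam) * log_A - beta * log(z0)
            - h(beta * int_lam) / int_lam)


def critical_ratio(lo=1e-4, hi=0.1):
    """Zero crossing of G between lo (where G < 0) and hi (where G > 0)."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if growth_rate_36(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


alpha_star = critical_ratio()
print(alpha_star)
```

The crossing found this way should agree, after rounding, with the value $\alpha^* = 0.023$ quoted for the (3, 6) LDPC ensemble in the finite-length discussion of Section IV.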



C. Solution via Polynomial System 

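We solved the optimization problem (5) using Lagrange
multipliers. Letting

$$ S(\boldsymbol{\alpha}, \boldsymbol{\beta}) = \sum_{t \in I_v} X^{(t)}(\alpha_t, \beta_t) + Y(\beta) $$

and recalling (6), at the maximum we must have

$$ \frac{\partial S(\boldsymbol{\alpha}, \boldsymbol{\beta})}{\partial \alpha_t} = \mu \frac{\partial R(\boldsymbol{\alpha}, \boldsymbol{\beta})}{\partial \alpha_t} $$

for all $t \in I_v$, where $\mu$ is the Lagrange multiplier. After some
calculation, this equation simplifies to $\log x_{0,t} = -\mu$ $\forall t \in I_v$.
We conclude that all of the $\{x_{0,t}\}$ are equal, and we may write
$x_{0,t} = x_0$ for all $t \in I_v$. At the maximum, we must also have

$$ \frac{\partial S(\boldsymbol{\alpha}, \boldsymbol{\beta})}{\partial \beta_t} = \mu \frac{\partial R(\boldsymbol{\alpha}, \boldsymbol{\beta})}{\partial \beta_t} $$

for all $t \in I_v$. After some calculation, this equation simplifies
to $z_0 \, y_{0,t} \left( \frac{1 - \beta \int\lambda}{\beta \int\lambda} \right) = 1$ $\forall t \in I_v$. We conclude that all of the
$\{y_{0,t}\}$ are equal, and we may write $y_{0,t} = y_0$ for all $t \in I_v$.
Then the latter equation may be written as

$$ \frac{z_0 y_0}{1 + z_0 y_0} = \beta \int\lambda. \tag{12} $$

Thus, for $n_v > 1$, the growth rate may be evaluated by solving
numerically the $(2 n_v + 3) \times (2 n_v + 3)$ system of nonlinear
polynomial equations given by (7), (8), (9), (6) and (12). If all
VNs are of the same type ($n_v = 1$), (6) is redundant and (7),
(8), (9), (12) comprise a $4 \times 4$ system for numerical solution.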
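
The two simplifications labeled "after some calculation" can be sketched as follows. This is a reconstruction offered for the reader's convenience, assuming that $x_{0,t}$, $y_{0,t}$ and $z_0$ depend differentiably on $(\alpha_t, \beta_t)$ and $\beta$; it is not claimed to be the derivation used by the authors. It starts from $X^{(t)} = \delta_t \log B^{(t)}(x_{0,t}, y_{0,t}) - \alpha_t \log x_{0,t} - \beta_t \log y_{0,t}$ and $Y = \frac{\int\rho}{\int\lambda} \log A(z_0) - \beta \log z_0 - \frac{h(\beta \int\lambda)}{\int\lambda}$, and uses $h'(x) = \log\frac{1 - x}{x}$.

```latex
\begin{align*}
\frac{\partial X^{(t)}}{\partial \alpha_t}
  &= \underbrace{\Big( \delta_t \frac{\partial_x B^{(t)}}{B^{(t)}} - \frac{\alpha_t}{x_{0,t}} \Big)}_{=\,0 \text{ by (8)}}
     \frac{\partial x_{0,t}}{\partial \alpha_t}
   + \underbrace{\Big( \delta_t \frac{\partial_y B^{(t)}}{B^{(t)}} - \frac{\beta_t}{y_{0,t}} \Big)}_{=\,0 \text{ by (9)}}
     \frac{\partial y_{0,t}}{\partial \alpha_t}
   - \log x_{0,t}
   = - \log x_{0,t}, \\
\frac{\partial X^{(t)}}{\partial \beta_t}
  &= - \log y_{0,t} \quad \text{(same cancellation)}, \\
\frac{d Y}{d \beta}
  &= \underbrace{\Big( \frac{\int\rho}{\int\lambda} \frac{A'(z_0)}{A(z_0)} - \frac{\beta}{z_0} \Big)}_{=\,0 \text{ by (7)}}
     \frac{d z_0}{d \beta}
   - \log z_0 - \log \frac{1 - \beta \int\lambda}{\beta \int\lambda}.
\end{align*}
% Since dR/d(alpha_t) = 1 and dR/d(beta_t) = 0 by (6):
\begin{align*}
\frac{\partial S}{\partial \alpha_t} = \mu
  &\;\Longrightarrow\; \log x_{0,t} = -\mu, \\
\frac{\partial S}{\partial \beta_t} = 0
  &\;\Longrightarrow\; \log \Big( z_0 \, y_{0,t} \, \frac{1 - \beta \int\lambda}{\beta \int\lambda} \Big) = 0 .
\end{align*}
```

Neither right-hand side depends on $t$, which is why a single pair $(x_0, y_0)$ suffices and the unknowns reduce to $x_0$, $y_0$, $z_0$ and the $2 n_v$ quantities $\alpha_t$, $\beta_t$.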

IV. Asymptotic and Finite Length Analysis of a 
Rate-1/2 D-GLDPC Ensemble 

We consider as a case study a rate-1/2 ensemble where
all the CNs are (7, 4) Hamming codes, and all the VNs are
length-7 SPC codes. Three representations of the SPC VNs
are considered. The first two are the systematic (S) and the
cyclic (C) representations. The third one is what we call
the antisystematic (A) representation, whose $(k \times (k + 1))$
generator matrix is obtained from the generator matrix in S
form by complementing each bit in the first $k$ columns^3.

The values of $\epsilon^*$, evaluated as reviewed in Section III-A, are
provided in Table I for the rate-1/2 strongly regular ensembles
with VNs in A, S and C forms. The A form exhibits the worst
threshold, while the C form achieves the best one. We observe
a heavy dependence of the threshold on the VN representation.
Note also that the strongly regular C form ensemble achieves
a better threshold than the (3, 6) regular LDPC ensemble, for
which we have $\epsilon^* = 0.429$. Next, we searched for a weakly
regular ensemble with an optimal mix (from an $\epsilon^*$ viewpoint)
of the three VN representations. The problem consists of
maximizing $\epsilon^*$ subject to all the CNs being (7, 4) Hamming
codes, the VNs being length-7 SPC codes with A, S or C form
and $R = 1/2$, where $R$ is given in (3). The problem was solved
using differential evolution (DE) [11]. The optimum weakly
regular ensemble (denoted WR in Table I) is characterized by
a fraction 0.578 of VNs in A form and a fraction 0.422 of
VNs in S form^4. Its threshold ($\epsilon^* = 0.481$) is considerably larger
than that of the (3, 6) LDPC ensemble. Remarkably, it has
been obtained only by combining different VN representations,
without affecting the regularity of the Tanner graph.

TABLE I
Asymptotic and Finite Length Parameters of Rate-1/2
Strongly and Weakly Regular D-GLDPC Codes

                                      v     c     A      S      C      WR
Asymptotic Parameters
  ε*                                              0.332  0.415  0.450  0.481
  α* (×10^-3)                                     11.7   7.2    10.7   9.7
Finite Length Parameters (n = 500)
  CWD A                               9     6     21     17     27     17
  CWD B                               9     6     33     13     21     21
  CWD C                               9     6     27     15     33     23
  CWD D                               9     6     30     14     27     22
  CWD E                               11    7     37     15     29     27
  CWD F                               12    8     27     23     22     23
  α*n                                             5.85   3.60   5.35   4.85

^3 Note that a $(k \times (k + 1))$ generator matrix in A form represents an SPC
code if and only if the code length $q = k + 1$ is odd. For even $k + 1$ we
obtain a $d_{\min} = 1$ code with one codeword of weight 1.

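Next, we evaluated $G(\alpha)$ for these D-GLDPC ensembles.

Proposition 2: The IO-WEFs for the length-$(k + 1)$ SPC
codes are^5
S form:

$$ B(x, y) = \frac{1}{2} \left[ (1 + y)(1 + xy)^k + (1 - y)(1 - xy)^k \right] $$

A form ($k$ even):

$$ B(x, y) = \frac{1}{2} \left[ (1 + xy)^k + (1 - xy)^k + y (x + y)^k - y (x - y)^k \right] $$

C form: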
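
The S-form and A-form expressions $\frac{1}{2}[(1 + y)(1 + xy)^k + (1 - y)(1 - xy)^k]$ and $\frac{1}{2}[(1 + xy)^k + (1 - xy)^k + y(x + y)^k - y(x - y)^k]$ can be cross-checked by enumerating all $2^k$ input words, and the same enumeration yields the C-form coefficients numerically. The sketch below does this for $k = 6$. It rests on assumptions that should be read as such: the S form is taken as $[I_k \mid \mathbf{1}]$, the A form as its first $k$ columns complemented, and the C form is assumed to have rows equal to shifts of $(1\,1\,0 \cdots 0)$ (generator polynomial $1 + D$), which the text does not spell out. Under that last assumption the codeword weight is twice the number of runs of ones in the input word, which gives the counting formula in `c_form_coefficient`; this formula is an illustration and not necessarily the expression derived in [10].

```python
from collections import Counter
from itertools import product
from math import comb, isclose


def spc_generator(k, form):
    """k x (k+1) generator matrix (list of rows) of the length-(k+1) SPC code."""
    if form == "S":    # systematic: [I_k | all-ones column]
        return [[int(i == j) for j in range(k)] + [1] for i in range(k)]
    if form == "A":    # S form with each bit of the first k columns complemented
        return [[int(i != j) for j in range(k)] + [1] for i in range(k)]
    if form == "C":    # ASSUMPTION: row i is (1 1 0 ... 0) shifted by i places
        return [[int(j in (i, i + 1)) for j in range(k + 1)] for i in range(k)]
    raise ValueError(form)


def io_wef(G):
    """Counter mapping (input weight u, codeword weight v) to B_{u,v}."""
    counts = Counter()
    for msg in product((0, 1), repeat=len(G)):
        codeword = [sum(m * g for m, g in zip(msg, col)) % 2 for col in zip(*G)]
        counts[(sum(msg), sum(codeword))] += 1
    return counts


def evaluate(counts, x, y):
    """Value of the IO-WEF polynomial at the point (x, y)."""
    return sum(b * x ** u * y ** v for (u, v), b in counts.items())


def closed_form_S(k, x, y):
    return 0.5 * ((1 + y) * (1 + x * y) ** k + (1 - y) * (1 - x * y) ** k)


def closed_form_A(k, x, y):
    """Closed form for the A representation; valid for even k only."""
    return 0.5 * ((1 + x * y) ** k + (1 - x * y) ** k
                  + y * (x + y) ** k - y * (x - y) ** k)


def c_form_coefficient(k, u, r):
    """Inputs of weight u whose ones form r runs (codeword weight 2r, assumed C form)."""
    return comb(u - 1, r - 1) * comb(k - u + 1, r)


k = 6
wef = {form: io_wef(spc_generator(k, form)) for form in "SAC"}
print(isclose(evaluate(wef["S"], 0.3, 0.7), closed_form_S(k, 0.3, 0.7)))
print(isclose(evaluate(wef["A"], 0.3, 0.7), closed_form_A(k, 0.3, 0.7)))
print(sorted(wef["C"].items()))
```

The enumeration is exponential in $k$, which is harmless for the length-7 VNs considered here but is the reason closed forms are preferable for the polynomial system of Section III-C.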

The plots of growth rate for the three strongly regular
ensembles are shown in Fig. 1. These are evaluated using the
method described in Section III-C, which involves solution
of a 4 × 4 polynomial system. The growth rate for the WR
ensemble is also plotted in Fig. 1, based on the solution of
a 7 × 7 polynomial system following the method described
in Section III-C. Note that the same plots can be derived
also by implementing (5) (or (10)) numerically. However, this
approach is characterized by an intrinsic numerical inaccuracy
due to the need to quantize the space over which the optimization
is performed (finer-grained quantization comes at a price
in computational speed, which becomes more pronounced as
the optimization space dimensionality increases). The values
of $\alpha^*$ for the analyzed ensembles are reported in Table I.

Fig. 1. Growth rate curves for the rate-1/2 strongly and weakly regular
D-GLDPC ensembles.

^4 The fact that only the S and A forms are used in the optimal ensemble
may be intuitively justified by the fact that the EXIT function of an SPC VN
in S form matches tightly the EXIT function of a Hamming CN for values
of $I_A$ close to 1, while the EXIT function of an SPC VN in A form matches
tightly the EXIT function of a Hamming CN for values of $I_A$ close to 0.

^5 The proof is omitted due to space constraints and can be found in [10].
While the derivation of $B(x, y)$ for the S and A forms is straightforward, for
the C form the formula has been obtained using a recursive approach.

In Fig. 2 the performance curves over the BEC are shown
for rate-1/2 $N = 3000$ D-GLDPC codes from these ensembles.
The curves correspond to iterative decoding, with MAP
decoding at each node. The four simulated D-GLDPC codes
have the same Tanner graph, the only difference being in their
SPC VN representations. The Tanner graph was generated
using the progressive edge-growth (PEG) algorithm [12], and
is composed of $m = 500$ degree-7 CNs and $n = 500$
degree-7 VNs. For the WR code, 289 VNs are in A form
and 211 are in S form (these values target the optimized
ensemble found earlier in this section). The performance
curve labeled "LDPC" in Fig. 2 is that of an $N = 3000$
(3, 6)-regular LDPC code generated with the PEG algorithm.
The waterfall region of the performance curves reflects the
asymptotic thresholds presented in Table I, with the WR code
exhibiting the best waterfall performance, even if at the price 
of an error floor at CER ~ 10~^. We observe how the LDPC 
code is outperformed in the waterfall region by both the WR 
and the strongly regular C codes. Again, we observe how a 
modification in the VN representations can heavily affect the 
D-GLDPC code performance. 

An analysis of the error floor was carried out for the simulated
D-GLDPC codes by collecting small-size stopping sets
encountered during the simulation^6. Six small-size stopping
sets were collected, each one coinciding with a small-weight
codeword (labeled 'CWD A' to 'CWD F'). The weights of
such codewords are reported in Table I, where $v$ and $c$ denote
the number of VNs and CNs involved in the subgraph induced
by the codeword, respectively^7. For each code, the smallest
among such weights is an estimate of the minimum distance
and is reported in bold in Table I. We observe that each of
these estimates is significantly larger than the corresponding
value $\alpha^* n$ for $n = 500$, revealing the beneficial effect of a
PEG-based construction. On the other hand these estimates are
significantly smaller than the value $\alpha^* n = 69$ for the (3, 6)
LDPC code^8, suggesting that the new codes offer worse error
floor properties than the regular LDPC counterparts.

The estimates of the minimum stopping set sizes were used
to calculate a prediction of the error floor. Letting $\delta$ be the
minimum stopping set size and assuming a multiplicity one
for stopping sets of minimum size, the probability of decoding
error fulfills

$$ P_e \geq \epsilon^{\delta} \tag{13} $$

as the RHS of (13) is the probability that the starting erasure
pattern includes the stopping set of minimum size. As depicted
in Fig. 2, the lower bound (13) is very tight in the error floor
region for the strongly regular S code and WR code. Moreover,
it predicts an error floor lower than CER = 10^* for the
strongly regular C code which, therefore, exhibits a quite good
compromise between waterfall and error floor performance,
while preserving graphical regularity.

Fig. 2. Performance over the BEC of (3000, 1500) D-GLDPC codes and of
a (3000, 1500) (3, 6) regular LDPC code. (CER: codeword error rate. BER:
bit error rate. $\epsilon$: BEC erasure probability.)

The subgraphs induced by the codewords of Table I are
depicted in Fig. 3 (the codewords 'A' to 'D' share the same
structure). With the exception of 'CWD E', which involves
a weight-4 local codeword for one of the Hamming CNs, all
the D-GLDPC codewords are associated with local codewords
of minimum weight at the nodes, i.e., weight-3 codewords at
the CNs and weight-2 codewords at the VNs. Interestingly, all
these subgraphs share a similar structure, composed of a layer
of VNs interconnecting two cycles (for 'CWD E' one cycle
and one structure composed of two overlapping cycles).

Fig. 3. Subgraphs induced by the codewords listed in Table I; the panels
are labeled "A, B, C, D", "E" and "F". (Circles: VNs. Squares: CNs.)

V. Conclusion

Motivated by the search for new coding schemes with
iterative decoding, a class of D-GLDPC codes with Hamming
CNs and SPC VNs has been analyzed over the BEC. The
asymptotic analysis has been conducted using both EXIT
charts and a proposed tool to evaluate the growth rate of
the weight distribution. Interesting features recognized from
the analysis of a rate-1/2 ensemble include the capability of
achieving a good compromise between waterfall and error
floor performance while preserving graphical regularity, and
values of threshold outperforming LDPC counterparts.

^6 It is important to observe that a small-size stopping set (resp. small-weight
codeword) collected for any of these D-GLDPC codes represents a small-size
stopping set (resp. small-weight codeword) also for the other ones, although
with a different size (resp. weight) due to different VN representations.

^7 The subgraph induced by a codeword is composed of the edges of the
Tanner graph carrying a '1' for the given codeword, and the VNs and CNs
connected to these edges. Note that, in the subgraph, the edges incident on a
VN or CN are associated with a valid local codeword for the node.

^8 From [1] we have $\alpha^* = 0.023$ for the (3, 6) LDPC ensemble, so that
$\alpha^* n = 0.023 \times 3000 = 69$ (for the LDPC code we have $n = N = 3000$).
This value represents a lower bound on the LDPC code minimum distance.